Incremental construction of minimal acyclic finite-state automata

Authors

  • Jan Daciuk
  • Stoyan Mihov
  • Bruce W. Watson
  • Richard Watson
Abstract

In this paper, we describe a new method for constructing minimal, deterministic, acyclic finite-state automata from a set of strings. Traditional methods consist of two phases: the first to construct a trie, the second to minimize it. Our approach is to construct a minimal automaton in a single phase by adding new strings one by one and minimizing the resulting automaton on-the-fly. We present a general algorithm as well as a specialization that relies upon the lexicographical ordering of the input strings. Our method is fast and significantly lowers memory requirements in comparison to other methods.
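
The single-phase idea for lexicographically sorted input can be illustrated with a minimal Python sketch. It is not the authors' code: the names `State`, `build_minimal_dawg`, `replace_or_register`, `accepts`, and `count_states` are chosen here for illustration, and the sketch assumes the input is sorted by code point and contains no duplicates. Each new word is walked along its common prefix with the automaton; before the new suffix is attached, the chain left over from the previous word is minimized by looking up each state in a register of already-known equivalent states.

```python
import itertools

_ids = itertools.count()  # unique state numbers, used in signatures


class State:
    """One automaton state: outgoing transitions plus a final flag."""

    def __init__(self):
        self.id = next(_ids)
        self.edges = {}       # symbol -> State
        self.final = False

    def signature(self):
        # Two states are equivalent when they agree on finality and on every
        # (symbol, target) pair; targets are assumed to be canonical already.
        return (self.final,
                tuple(sorted((c, t.id) for c, t in self.edges.items())))


def build_minimal_dawg(sorted_words):
    """Build a minimal acyclic DFA from sorted, duplicate-free words."""
    root = State()
    register = {}             # signature -> canonical State
    previous = None

    def replace_or_register(state):
        # With sorted input, the most recently added transition carries the
        # largest symbol; only the chain below it can still be unminimized.
        symbol = max(state.edges)
        child = state.edges[symbol]
        if child.edges:
            replace_or_register(child)
        key = child.signature()
        if key in register:
            state.edges[symbol] = register[key]   # reuse equivalent state
        else:
            register[key] = child                 # child becomes canonical

    for word in sorted_words:
        if previous is not None and word <= previous:
            raise ValueError("input must be sorted and free of duplicates")
        # Follow the longest prefix of the word already in the automaton.
        state, i = root, 0
        while i < len(word) and word[i] in state.edges:
            state = state.edges[word[i]]
            i += 1
        # Minimize what remains of the previous word below this point.
        if state.edges:
            replace_or_register(state)
        # Attach the new suffix as a fresh chain of states.
        for ch in word[i:]:
            nxt = State()
            state.edges[ch] = nxt
            state = nxt
        state.final = True
        previous = word

    if root.edges:
        replace_or_register(root)
    return root


def accepts(root, word):
    """Return True if the automaton rooted at `root` accepts `word`."""
    state = root
    for ch in word:
        if ch not in state.edges:
            return False
        state = state.edges[ch]
    return state.final


def count_states(root):
    """Count the states reachable from `root`."""
    seen, stack = set(), [root]
    while stack:
        s = stack.pop()
        if s.id not in seen:
            seen.add(s.id)
            stack.extend(s.edges.values())
    return len(seen)
```

For the words `tap`, `taps`, `top`, `tops`, a trie needs 8 states, while this sketch yields 5, because the branches under `a` and `o` are recognized as equivalent and merged as the words are added. The register holds only minimized states, which is where the memory saving over the two-phase approach would come from.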

Similar resources

Incremental Construction of Compact Acyclic NFAs

This paper presents and analyzes an incremental algorithm for the construction of Acyclic Nondeterministic Finite-state Automata (NFA). Automata of this type are quite useful in computational linguistics, especially for storing lexicons. The proposed algorithm produces compact NFAs, i.e. NFAs that do not contain equivalent states. Unlike Deterministic Finite-state Automata (DFA), this property ...

Incremental Construction of Minimal Acyclic Sequential Transducers from Unsorted Data

This paper presents an efficient algorithm for the incremental construction of a minimal acyclic sequential transducer (ST) for a dictionary consisting of a list of input and output strings. The algorithm generalises a known method of constructing minimal finite-state automata (Daciuk et al., 2000). Unlike the algorithm published by Mihov and Maurel (2001), it does not require the input strings...

Incremental Construction of Minimal Sequential Transducers

This paper presents an efficient algorithm for the incremental construction of a minimal acyclic sequential transducer (ST) from a list of input and output strings. The algorithm generalizes a known method of constructing minimal finite-state automata (Daciuk, Mihov, Watson and Watson 2000). Unlike the algorithm published by Mihov and Maurel (2001), it does not require the input strings to be s...

Comments on "Incremental Construction and Maintenance of Minimal Finite-State Automata," by Rafael C. Carrasco and Mikel L. Forcada

In a recent article, Carrasco and Forcada (June 2002) presented two algorithms: one for incremental addition of strings to the language of a minimal, deterministic, cyclic automaton, and one for incremental removal of strings from the automaton. The first algorithm is a generalization of the “algorithm for unsorted data”—the second of the two incremental algorithms for construction of minimal, ...

Incremental Construction and Maintenance of Minimal Finite-State Automata

Daciuk et al. [Computational Linguistics 26(1):3–16 (2000)] describe a method for constructing incrementally minimal, deterministic, acyclic finite-state automata (dictionaries) from sets of strings. But acyclic finite-state automata have limitations: For instance, if one wants a linguistic application to accept all possible integer numbers or Internet addresses, the corresponding finite-state a...

Incremental Construction Of Minimal Acyclic Finite State Automata And Transducers

In this paper, we describe a new method for constructing minimal, deterministic, acyclic finite state automata and transducers. Traditional methods consist of two steps. The first one is to construct a trie, the second one is to perform minimization. Our approach is to construct an automaton in a single step by adding new strings one by one and minimizing the resulting automaton on-the-fly. We pre...

Journal:
  • Computational Linguistics

Volume 26  Issue

Pages  -

Publication date 2000